Background
Microbiome sample clustering can be performed using either model-based methods and machine learning methods. Machine learning methods, which rely on defined distance metrics, are used more frequently than model-based statistical methods (“due to their efficient implementation and easy interpretation.”)
- I used the partition around medoids (PAM) clustering method, which is related to but considered more robust than K-means. In contrast to K-means, which can be sensitive to the effects of outliers, PAM’s optimization goal is to minimize the sum of distances to the medoids instead of minimizing the sum of the squared distances to the cluster centers.
Note: clustering was performed directly on distance matrices, not ordinations or ordination scores! This was done to eliminate the introduction of artifacts arising from clustering ordination scores.
Methods
Partitioning methods, such as PAM clustering require the number of clusters to be generated as an input value. Here we are viewing the gap statistic as calculated by the function cluster::clusGap(). The optimal number of clusters was specified using the “Tibs2001SEmax” method, Tibshirani et al’s recommendation of determining the number of clusters from the gap statistics and their standard deviations.
clusGap() calculates a goodness of clustering measure, the “gap” statistic. For each number of clusters \(k\), it compares \(log(W(k))\) with \(E∗[log(W(k))]\) where the latter is defined via bootstrapping, i.e., simulating from a reference (\(H_0\)) distribution, a uniform distribution on the hypercube determined by the ranges of x, after first centering, and then svd (aka ‘PCA’)-rotating them when (as by default) spaceH0 = "scaledPCA".
\(k\) such that its value \(f(k)\) is not more than 1 standard error away from the first local maximum. This is similar but not the same as maxSE(f, SE.f) determines the location of the maximum of f, taking a “1-SE rule” into account for the *SE* methods. The default method "firstSEmax" looks for the smallest"Tibs2001SEmax", Tibshirani et al’s recommendation of determining the number of clusters from the gap statistics and their standard deviations.
References
Tibshirani, R., Walther, G. and Hastie, T. (2001). Estimating the number of data clusters via the Gap statistic. Journal of the Royal Statistical Society B, 63, 411–423.
Tibshirani, R., Walther, G. and Hastie, T. (2000). Estimating the number of clusters in a dataset via the Gap statistic. Technical Report. Stanford.
Dudoit, S. and Fridlyand, J. (2002) A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biology 3(7). [doi:10.1186/gb-2002-3-7-research0036\\](doi:10.1186/gb-2002-3-7-research0036\
Per Broberg (2006). SAGx: Statistical Analysis of the GeneChip. R package version 1.9.7. https://bioconductor.org/packages/3.12/bioc/html/SAGx.html Deprecated and removed from Bioc ca. 2022
We found the optimal number of clusters for the Sørensen-based dissimilarities of the fungal communities to be 10.
We found the optimal number of clusters for the Sørensen-based dissimilarities of the bacterial communities to be 14.
Note the density of cluster 1 (now F1) - I’ll investigate that further below. Fungal clusters have now been relabeled in an intuitive order based on decreasing similiarities to F1.
We do see clusters with only 1 site and others with multiple sites.
| Df | SumOfSqs | R2 | F | Pr(>F) | |
|---|---|---|---|---|---|
| Fun_sor_clusters | 9 | 56.9 | 0.34 | 27.12 | 0.001 |
| Residual | 474 | 110.5 | 0.66 | NA | NA |
| Total | 483 | 167.4 | 1.00 | NA | NA |
Across the whole dataset, the fungal clustering explained 33.99% of the community variability (presence/absence).
TBD
Across the whole dataset, the bacterial clustering explained
33.99% of the community
variability (presence/absence).
Note that in fungal cluster “F2” there were only 10 samples (all site=SFA, grass=SCSC), and because enviromental data was measured on a pooled site-by-grass bases, no underlying variability exists to perform stats on.
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wholecommunity.pH_m.std | 1 | 5.766 | 0.034 | 17.188 | 0.001 | Fun_wholecommunity | pH |
| Fun_wholecommunity.soil_moisture_m.std | 1 | 5.695 | 0.034 | 16.969 | 0.001 | Fun_wholecommunity | soil_moisture |
| Fun_wholecommunity.SOM_m.std | 1 | 6.082 | 0.036 | 18.167 | 0.001 | Fun_wholecommunity | SOM |
| Fun_wholecommunity.ammonium_m.std | 1 | 6.447 | 0.039 | 19.303 | 0.001 | Fun_wholecommunity | ammonium |
| Fun_wholecommunity.phos_m.std | 1 | 6.227 | 0.037 | 18.616 | 0.001 | Fun_wholecommunity | phos |
| Climate | |||||||
| Fun_wholecommunity.ppt3yr_m.std | 1 | 6.046 | 0.036 | 18.057 | 0.001 | Fun_wholecommunity | ppt3yr |
| Fun_wholecommunity.GDD3yr_m.std | 1 | 2.574 | 0.015 | 7.526 | 0.001 | Fun_wholecommunity | GDD3yr |
| Plant | |||||||
| Fun_wholecommunity.avg_SRL_m.std | 1 | 1.847 | 0.011 | 5.375 | 0.001 | Fun_wholecommunity | avg_SRL |
| Fun_wholecommunity.avg_SLA_m.std | 1 | 3.154 | 0.019 | 9.253 | 0.001 | Fun_wholecommunity | avg_SLA |
| Fun_wholecommunity.herbivory_perc_m.std | 1 | 2.380 | 0.014 | 6.949 | 0.001 | Fun_wholecommunity | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F1.pH_m.std | 1 | 1.537 | 0.063 | 12.119 | 0.001 | Fun_wc_F1 | pH |
| Fun_wc_F1.soil_moisture_m.std | 1 | 1.203 | 0.049 | 9.347 | 0.001 | Fun_wc_F1 | soil_moisture |
| Fun_wc_F1.SOM_m.std | 1 | 1.529 | 0.062 | 12.055 | 0.001 | Fun_wc_F1 | SOM |
| Fun_wc_F1.ammonium_m.std | 1 | 1.193 | 0.049 | 9.271 | 0.001 | Fun_wc_F1 | ammonium |
| Fun_wc_F1.phos_m.std | 1 | 1.230 | 0.050 | 9.572 | 0.001 | Fun_wc_F1 | phos |
| Climate | |||||||
| Fun_wc_F1.ppt3yr_m.std | 1 | 1.863 | 0.076 | 14.898 | 0.001 | Fun_wc_F1 | ppt3yr |
| Fun_wc_F1.GDD3yr_m.std | 1 | 1.677 | 0.068 | 13.301 | 0.001 | Fun_wc_F1 | GDD3yr |
| Plant | |||||||
| Fun_wc_F1.avg_SRL_m.std | 1 | 0.655 | 0.027 | 4.976 | 0.001 | Fun_wc_F1 | avg_SRL |
| Fun_wc_F1.avg_SLA_m.std | 1 | 1.076 | 0.044 | 8.319 | 0.001 | Fun_wc_F1 | avg_SLA |
| Fun_wc_F1.herbivory_perc_m.std | 1 | 0.578 | 0.024 | 4.373 | 0.001 | Fun_wc_F1 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F3.pH_m.std | 1 | 1.664 | 0.223 | 7.751 | 0.001 | Fun_wc_F3 | pH |
| Fun_wc_F3.soil_moisture_m.std | 1 | 0.759 | 0.102 | 3.059 | 0.002 | Fun_wc_F3 | soil_moisture |
| Fun_wc_F3.SOM_m.std | 1 | 1.627 | 0.218 | 7.534 | 0.001 | Fun_wc_F3 | SOM |
| Fun_wc_F3.ammonium_m.std | 1 | 0.426 | 0.057 | 1.636 | 0.060 | Fun_wc_F3 | ammonium |
| Fun_wc_F3.phos_m.std | 1 | 0.948 | 0.127 | 3.931 | 0.001 | Fun_wc_F3 | phos |
| Climate | |||||||
| Fun_wc_F3.ppt3yr_m.std | 1 | 1.676 | 0.225 | 7.827 | 0.001 | Fun_wc_F3 | ppt3yr |
| Fun_wc_F3.GDD3yr_m.std | 1 | 1.676 | 0.225 | 7.827 | 0.001 | Fun_wc_F3 | GDD3yr |
| Plant | |||||||
| Fun_wc_F3.avg_SRL_m.std | 1 | 1.488 | 0.200 | 6.730 | 0.001 | Fun_wc_F3 | avg_SRL |
| Fun_wc_F3.avg_SLA_m.std | 1 | 1.670 | 0.224 | 7.789 | 0.001 | Fun_wc_F3 | avg_SLA |
| Fun_wc_F3.herbivory_perc_m.std | 1 | 0.645 | 0.087 | 2.557 | 0.005 | Fun_wc_F3 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F4.pH_m.std | 1 | 0.455 | 0.067 | 1.929 | 0.002 | Fun_wc_F4 | pH |
| Fun_wc_F4.soil_moisture_m.std | 1 | 0.352 | 0.052 | 1.469 | 0.012 | Fun_wc_F4 | soil_moisture |
| Fun_wc_F4.SOM_m.std | 1 | 0.483 | 0.071 | 2.060 | 0.002 | Fun_wc_F4 | SOM |
| Fun_wc_F4.ammonium_m.std | 1 | 0.352 | 0.052 | 1.469 | 0.009 | Fun_wc_F4 | ammonium |
| Fun_wc_F4.phos_m.std | 1 | 0.358 | 0.052 | 1.495 | 0.010 | Fun_wc_F4 | phos |
| Climate | |||||||
| Fun_wc_F4.ppt3yr_m.std | 1 | 0.409 | 0.060 | 1.724 | 0.068 | Fun_wc_F4 | ppt3yr |
| Fun_wc_F4.GDD3yr_m.std | 1 | 0.396 | 0.058 | 1.666 | 0.013 | Fun_wc_F4 | GDD3yr |
| Plant | |||||||
| Fun_wc_F4.avg_SRL_m.std | 1 | 0.468 | 0.069 | 1.989 | 0.001 | Fun_wc_F4 | avg_SRL |
| Fun_wc_F4.avg_SLA_m.std | 1 | 0.379 | 0.056 | 1.590 | 0.006 | Fun_wc_F4 | avg_SLA |
| Fun_wc_F4.herbivory_perc_m.std | 1 | 0.430 | 0.063 | 1.820 | 0.005 | Fun_wc_F4 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F5.pH_m.std | 1 | 0.720 | 0.156 | 2.779 | 0.001 | Fun_wc_F5 | pH |
| Fun_wc_F5.soil_moisture_m.std | 1 | 0.645 | 0.140 | 2.442 | 0.001 | Fun_wc_F5 | soil_moisture |
| Fun_wc_F5.SOM_m.std | 1 | 0.707 | 0.154 | 2.722 | 0.001 | Fun_wc_F5 | SOM |
| Fun_wc_F5.ammonium_m.std | 1 | 0.673 | 0.146 | 2.565 | 0.001 | Fun_wc_F5 | ammonium |
| Fun_wc_F5.phos_m.std | 1 | 0.651 | 0.141 | 2.467 | 0.001 | Fun_wc_F5 | phos |
| Climate | |||||||
| Fun_wc_F5.ppt3yr_m.std | 1 | 0.368 | 0.080 | 1.303 | 0.119 | Fun_wc_F5 | ppt3yr |
| Fun_wc_F5.GDD3yr_m.std | 1 | 0.402 | 0.087 | 1.435 | 0.041 | Fun_wc_F5 | GDD3yr |
| Plant | |||||||
| Fun_wc_F5.avg_SRL_m.std | 1 | 0.397 | 0.086 | 1.414 | 0.053 | Fun_wc_F5 | avg_SRL |
| Fun_wc_F5.avg_SLA_m.std | 1 | 0.691 | 0.150 | 2.648 | 0.001 | Fun_wc_F5 | avg_SLA |
| Fun_wc_F5.herbivory_perc_m.std | 1 | 0.399 | 0.087 | 1.422 | 0.063 | Fun_wc_F5 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F6.pH_m.std | 1 | 0.738 | 0.085 | 2.797 | 0.001 | Fun_wc_F6 | pH |
| Fun_wc_F6.soil_moisture_m.std | 1 | 0.551 | 0.064 | 2.039 | 0.004 | Fun_wc_F6 | soil_moisture |
| Fun_wc_F6.SOM_m.std | 1 | 0.496 | 0.057 | 1.822 | 0.007 | Fun_wc_F6 | SOM |
| Fun_wc_F6.ammonium_m.std | 1 | 0.538 | 0.062 | 1.989 | 0.001 | Fun_wc_F6 | ammonium |
| Fun_wc_F6.phos_m.std | 1 | 0.395 | 0.046 | 1.435 | 0.030 | Fun_wc_F6 | phos |
| Climate | |||||||
| Fun_wc_F6.ppt3yr_m.std | 1 | 0.594 | 0.069 | 2.210 | 0.001 | Fun_wc_F6 | ppt3yr |
| Fun_wc_F6.GDD3yr_m.std | 1 | 1.088 | 0.126 | 4.313 | 0.001 | Fun_wc_F6 | GDD3yr |
| Plant | |||||||
| Fun_wc_F6.avg_SRL_m.std | 1 | 0.703 | 0.081 | 2.651 | 0.001 | Fun_wc_F6 | avg_SRL |
| Fun_wc_F6.avg_SLA_m.std | 1 | 0.395 | 0.046 | 1.435 | 0.035 | Fun_wc_F6 | avg_SLA |
| Fun_wc_F6.herbivory_perc_m.std | 1 | 0.561 | 0.065 | 2.078 | 0.001 | Fun_wc_F6 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F7.pH_m.std | 1 | 1.035 | 0.099 | 3.962 | 0.001 | Fun_wc_F7 | pH |
| Fun_wc_F7.soil_moisture_m.std | 1 | 0.753 | 0.072 | 2.802 | 0.001 | Fun_wc_F7 | soil_moisture |
| Fun_wc_F7.SOM_m.std | 1 | 1.198 | 0.115 | 4.670 | 0.001 | Fun_wc_F7 | SOM |
| Fun_wc_F7.ammonium_m.std | 1 | 0.582 | 0.056 | 2.125 | 0.001 | Fun_wc_F7 | ammonium |
| Fun_wc_F7.phos_m.std | 1 | 0.362 | 0.035 | 1.293 | 0.107 | Fun_wc_F7 | phos |
| Climate | |||||||
| Fun_wc_F7.ppt3yr_m.std | 1 | 1.199 | 0.115 | 4.671 | 0.001 | Fun_wc_F7 | ppt3yr |
| Fun_wc_F7.GDD3yr_m.std | 1 | 1.008 | 0.097 | 3.848 | 0.001 | Fun_wc_F7 | GDD3yr |
| Plant | |||||||
| Fun_wc_F7.avg_SRL_m.std | 1 | 0.801 | 0.077 | 2.993 | 0.001 | Fun_wc_F7 | avg_SRL |
| Fun_wc_F7.avg_SLA_m.std | 1 | 0.790 | 0.076 | 2.948 | 0.001 | Fun_wc_F7 | avg_SLA |
| Fun_wc_F7.herbivory_perc_m.std | 1 | 0.622 | 0.060 | 2.283 | 0.001 | Fun_wc_F7 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F8.pH_m.std | 1 | 1.317 | 0.076 | 4.176 | 0.001 | Fun_wc_F8 | pH |
| Fun_wc_F8.soil_moisture_m.std | 1 | 1.244 | 0.071 | 3.927 | 0.001 | Fun_wc_F8 | soil_moisture |
| Fun_wc_F8.SOM_m.std | 1 | 1.586 | 0.091 | 5.116 | 0.001 | Fun_wc_F8 | SOM |
| Fun_wc_F8.ammonium_m.std | 1 | 1.038 | 0.060 | 3.235 | 0.001 | Fun_wc_F8 | ammonium |
| Fun_wc_F8.phos_m.std | 1 | 1.135 | 0.065 | 3.558 | 0.001 | Fun_wc_F8 | phos |
| Climate | |||||||
| Fun_wc_F8.ppt3yr_m.std | 1 | 1.924 | 0.111 | 6.343 | 0.001 | Fun_wc_F8 | ppt3yr |
| Fun_wc_F8.GDD3yr_m.std | 1 | 1.370 | 0.079 | 4.358 | 0.001 | Fun_wc_F8 | GDD3yr |
| Plant | |||||||
| Fun_wc_F8.avg_SRL_m.std | 1 | 0.891 | 0.051 | 2.754 | 0.001 | Fun_wc_F8 | avg_SRL |
| Fun_wc_F8.avg_SLA_m.std | 1 | 1.066 | 0.061 | 3.330 | 0.001 | Fun_wc_F8 | avg_SLA |
| Fun_wc_F8.herbivory_perc_m.std | 1 | 1.116 | 0.064 | 3.495 | 0.001 | Fun_wc_F8 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F9.pH_m.std | 1 | 0.727 | 0.041 | 2.328 | 0.001 | Fun_wc_F9 | pH |
| Fun_wc_F9.soil_moisture_m.std | 1 | 0.928 | 0.053 | 3.005 | 0.001 | Fun_wc_F9 | soil_moisture |
| Fun_wc_F9.SOM_m.std | 1 | 1.340 | 0.076 | 4.450 | 0.001 | Fun_wc_F9 | SOM |
| Fun_wc_F9.ammonium_m.std | 1 | 0.914 | 0.052 | 2.957 | 0.001 | Fun_wc_F9 | ammonium |
| Fun_wc_F9.phos_m.std | 1 | 0.808 | 0.046 | 2.599 | 0.001 | Fun_wc_F9 | phos |
| Climate | |||||||
| Fun_wc_F9.ppt3yr_m.std | 1 | 1.227 | 0.070 | 4.048 | 0.001 | Fun_wc_F9 | ppt3yr |
| Fun_wc_F9.GDD3yr_m.std | 1 | 1.064 | 0.060 | 3.473 | 0.001 | Fun_wc_F9 | GDD3yr |
| Plant | |||||||
| Fun_wc_F9.avg_SRL_m.std | 1 | 0.650 | 0.037 | 2.071 | 0.001 | Fun_wc_F9 | avg_SRL |
| Fun_wc_F9.avg_SLA_m.std | 1 | 0.703 | 0.040 | 2.246 | 0.001 | Fun_wc_F9 | avg_SLA |
| Fun_wc_F9.herbivory_perc_m.std | 1 | 0.789 | 0.045 | 2.534 | 0.001 | Fun_wc_F9 | herbivory_perc |
| Df | SumOfSqs | R2 | F | Pr(>F) | comm | expl_var | |
|---|---|---|---|---|---|---|---|
| Edaphic | |||||||
| Fun_wc_F10.pH_m.std | 1 | 0.868 | 0.074 | 2.815 | 0.001 | Fun_wc_F10 | pH |
| Fun_wc_F10.soil_moisture_m.std | 1 | 1.072 | 0.092 | 3.541 | 0.001 | Fun_wc_F10 | soil_moisture |
| Fun_wc_F10.SOM_m.std | 1 | 0.970 | 0.083 | 3.175 | 0.001 | Fun_wc_F10 | SOM |
| Fun_wc_F10.ammonium_m.std | 1 | 1.097 | 0.094 | 3.635 | 0.001 | Fun_wc_F10 | ammonium |
| Fun_wc_F10.phos_m.std | 1 | 1.113 | 0.095 | 3.693 | 0.001 | Fun_wc_F10 | phos |
| Climate | |||||||
| Fun_wc_F10.ppt3yr_m.std | 1 | 0.753 | 0.065 | 2.416 | 0.001 | Fun_wc_F10 | ppt3yr |
| Fun_wc_F10.GDD3yr_m.std | 1 | 0.797 | 0.068 | 2.567 | 0.001 | Fun_wc_F10 | GDD3yr |
| Plant | |||||||
| Fun_wc_F10.avg_SRL_m.std | 1 | 0.738 | 0.063 | 2.362 | 0.001 | Fun_wc_F10 | avg_SRL |
| Fun_wc_F10.avg_SLA_m.std | 1 | 0.768 | 0.066 | 2.466 | 0.001 | Fun_wc_F10 | avg_SLA |
| Fun_wc_F10.herbivory_perc_m.std | 1 | 0.814 | 0.070 | 2.626 | 0.001 | Fun_wc_F10 | herbivory_perc |
TBD
[in-progress]
| Cluster | total n | Site(s) | Grass(es) | Characteristics† | Exclusivity | |
|---|---|---|---|---|---|---|
| 1 | 183 | spans pH range | ||||
| 2 | 37 | BNP,DMT,FMT,SEV | BOER (n=33), BOGR (n=4) | high pH* (6.8 - 8.3, mean 7.5) | ||
| 3 | 56 | high pH**** (6.8 - 8.3, mean 7.8) | ||||
| 4 | 17 | lower pH**** (5.6 - 7.6, mean 6.2) | ||||
| 5 | 43 | high pH** (7.1 - 7.8, mean 7.6) | ||||
| 6 | 32 | lower pH**** (5.1 - 7.8, mean 6.2) | ||||
| 7 | 38 | |||||
| 8 | 29 | mostly LAR (n=27) | high pH**** (7.2 - 8.2, mean 8.0) | |||
| 9 | 9 | SFA | SCSC (only grass present) | Site=SFA | ||
| 10 | 29 | KAE | lower pH**** (5.9 - 7.2, mean 6.3) | Site=KAE (of the 32 KAE samples, only 3 others were in diff clusters) |
Clusters: Sørensen dissimilarity clusters based on pam (k = 10)
Method: R package randomForest v4.7.1.1
predictors.all<-t(otu_table(Fun_wholecommunity))
response.clus_sor_k10<-as.factor(sample_data(Fun_wholecommunity)$clus_sor_k10_new)
rf.data.clus_sor_k10<-data.frame(response.clus_sor_k10, predictors.all)
classify.clus_sor_k10<-randomForest(response.clus_sor_k10~., data = rf.data.clus_sor_k10, ntree=999)
Call: randomForest(formula = response.clus_sor_k10 ~ ., data = rf.data.clus_sor_k10, ntree = 999)
Type of random forest: classification
Number of trees: 999
No. of variables tried at each split: 81
OOB estimate of error rate: 6.82%
| F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | Cluster error % | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 183 | 0.0 | |||||||||
| F2 | 10 | 0.0 | |||||||||
| F3 | 28 | 1 | 3.4 | ||||||||
| F4 | 1 | 27 | 1 | 6.9 | |||||||
| F5 | 12 | 1 | 4 | 29.4 | |||||||
| F6 | 23 | 2 | 7 | 28.1 | |||||||
| F7 | 1 | 2 | 34 | 1 | 10.5 | ||||||
| F8 | 1 | 1 | 1 | 50 | 5.7 | ||||||
| F9 | 1 | 1 | 49 | 5 | 12.5 | ||||||
| F10 | 2 | 35 | 5.4 |
adonis2
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = phyloseq::distance(Fun_wholecommunity, method = "bray") ~ clus_sor_k10_new, data = (as(sample_data(Fun_wholecommunity), "data.frame")))
## Df SumOfSqs R2 F Pr(>F)
## clus_sor_k10_new 9 40.489 0.20715 13.761 0.001 ***
## Residual 474 154.968 0.79285
## Total 483 195.457 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Fungal whole community
Clusters 2 & 3 seem to be the outgroup / most distantly related
Clusters 1 & 9 are very similar, followed by 10 -> wrap-around cluster numbering
average - Species contribution to average between-group dissimilarity
cusum - Ordered cumulative contribution (to between-group dissimilarity). These are based on item average, but they sum up to total 1
p - Permutation \(p\)-value. Probability of getting a larger or equal average contribution in random permutation of the group factor
Mean Sørensen dissimilarity between Clusters 1 & 2: 0.920 (unweighted) / 0.903 (weighted by sample size)
Mean within-cluster similarities
- Cluster 1: 0.514
- Cluster 2: 0.802